 image sensor


J3DAI: A tiny DNN-Based Edge AI Accelerator for 3D-Stacked CMOS Image Sensor

Tain, Benoit, Millet, Raphael, Lemaire, Romain, Szczepanski, Michal, Alacoque, Laurent, Pluchart, Emmanuel, Choisnet, Sylvain, Prasad, Rohit, Chossat, Jerome, Pierunek, Pascal, Vivet, Pascal, Thuries, Sebastien

arXiv.org Artificial Intelligence

This paper presents J3DAI, a tiny deep neural network-based hardware accelerator for a 3-layer 3D-stacked CMOS image sensor featuring an artificial intelligence (AI) chip integrating a Deep Neural Network (DNN)-based accelerator. The DNN accelerator is designed to efficiently perform neural network tasks such as image classification and segmentation. This paper focuses on the digital system of J3DAI, highlighting its Performance-Power-Area (PPA) characteristics and showcasing advanced edge AI capabilities on a CMOS image sensor. To support the hardware, we utilized the Aidge comprehensive software framework, which enables the programming of both the host processor and the DNN accelerator. Aidge supports post-training quantization, significantly reducing memory footprint and computational complexity, making it crucial for deploying models on resource-constrained hardware like J3DAI. Our experimental results demonstrate the versatility and efficiency of this innovative design in the field of edge AI, showcasing its potential to handle both simple and computationally intensive tasks. Future work will focus on further optimizing the architecture and exploring new applications to fully leverage the capabilities of J3DAI. As edge AI continues to grow in importance, innovations like J3DAI will play a crucial role in enabling real-time, low-latency, and energy-efficient AI processing at the edge. The increasing adoption of intelligent vision systems in various domains, including the Internet of Things (IoT), healthcare, automotive applications, and smart surveillance, has led to an increasing demand for advanced image sensor technologies capable of real-time processing.
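As a rough illustration of why post-training quantization shrinks the memory footprint, the sketch below maps floating-point weights to 8-bit integer codes with a single shared scale. It is a generic symmetric per-tensor scheme written for this listing; the abstract does not say which scheme Aidge uses, and the function names and sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> integer codes in [-127, 127].

    Generic illustration only; not taken from Aidge.
    """
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only to make the example easy to follow.
weights = [0.42, -1.27, 0.05, 0.9]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
print(codes)  # [42, -127, 5, 90]
```

Each weight is then stored in one byte instead of four, and the reconstruction error per weight is bounded by half the scale.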


A Real-time Endoscopic Image Denoising System

Xing, Yu, Huang, Shishi, Lv, Meng, Chen, Guo, Wang, Huailiang, Sui, Lingzhi

arXiv.org Artificial Intelligence

Endoscopes featuring a miniaturized design have significantly enhanced operational flexibility, portability, and diagnostic capability while substantially reducing the invasiveness of medical procedures. Recently, single-use endoscopes equipped with an ultra-compact analog image sensor measuring less than 1 mm x 1 mm have brought revolutionary advancements to medical diagnosis. They reduce the structural redundancy and large capital expenditures associated with reusable devices, eliminate the risk of patient infections caused by inadequate disinfection, and alleviate patient suffering. However, the limited photosensitive area results in reduced photon capture per pixel, requiring higher photon sensitivity settings to maintain adequate brightness. In high-contrast medical imaging scenarios, the small-sized sensor exhibits a constrained dynamic range, making it difficult to simultaneously capture details in both highlights and shadows, and additional localized digital gain is required to compensate. Moreover, the simplified circuit design and analog signal transmission introduce additional noise sources. These factors collectively contribute to significant noise issues in processed endoscopic images. In this work, we developed a comprehensive noise model for analog image sensors in medical endoscopes, addressing three primary noise types: fixed-pattern noise, periodic banding noise, and mixed Poisson-Gaussian noise. Building on this analysis, we propose a hybrid denoising system that synergistically combines traditional image processing algorithms with advanced learning-based techniques for captured raw frames from sensors. Experiments demonstrate that our approach effectively reduces image noise without fine detail loss or color distortion, while achieving real-time performance on FPGA platforms and an average PSNR improvement from 21.16 dB to 33.05 dB on our test dataset.
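To make the mixed Poisson-Gaussian term and the PSNR metric concrete, here is a minimal sketch using only the Python standard library. It assumes the common textbook form of the model (signal-dependent shot noise scaled by a gain, plus additive Gaussian read noise); the gain, read-noise level, and pixel values are hypothetical and are not the paper's calibrated parameters. Fixed-pattern and banding noise are not modeled.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; suitable for small lam only."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def noisy_pixel(signal, gain, read_sigma, rng):
    """Shot noise (Poisson, scaled by gain) plus Gaussian read noise."""
    return gain * sample_poisson(signal / gain, rng) + rng.gauss(0.0, read_sigma)


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


rng = random.Random(0)          # fixed seed so the run is repeatable
clean = [50.0] * 2000           # hypothetical flat gray patch
noisy = [noisy_pixel(v, 2.0, 1.0, rng) for v in clean]
print(round(psnr(clean, noisy), 2))
```

Under this model the noise variance grows with the signal (gain times signal, plus the read-noise variance), which is why brighter regions are not simply "cleaner" after digital gain is applied.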


Evaluation of Mobile Environment for Vehicular Visible Light Communication Using Multiple LEDs and Event Cameras

Soga, Ryota, Shiba, Shintaro, Kong, Quan, Kobori, Norimasa, Shimizu, Tsukasa, Lu, Shan, Yamazato, Takaya

arXiv.org Artificial Intelligence

In the fields of Advanced Driver Assistance Systems (ADAS) and Autonomous Driving (AD), sensors that serve as the "eyes" for sensing the vehicle's surrounding environment are essential. Traditionally, image sensors and LiDAR have played this role. However, a new type of vision sensor, the event camera, has recently attracted attention. Event cameras respond to changes in the surrounding environment (e.g., motion), exhibit strong robustness against motion blur, and perform well in high dynamic range environments, which are desirable in robotics applications. Furthermore, the asynchronous and low-latency principles of data acquisition make event cameras suitable for optical communication. By adding communication functionality to event cameras, it becomes possible to utilize infrastructure-to-vehicle (I2V) communication to immediately share information about forward collisions, sudden braking, and road conditions, thereby contributing to hazard avoidance. Additionally, receiving information such as signal timing and traffic volume enables speed adjustment and optimal route selection, facilitating more efficient driving. In this study, we construct a vehicular visible light communication system where event cameras are receivers, and multiple LEDs are transmitters. In driving scenes, the system tracks the transmitter positions and separates densely packed LED light sources using pilot sequences based on Walsh-Hadamard codes. As a result, outdoor vehicle experiments demonstrate error-free communication under conditions where the transmitter-receiver distance was within 40 meters and the vehicle's driving speed was 30 km/h (8.3 m/s).
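The separation of overlapping LED sources relies on the mutual orthogonality of Walsh-Hadamard codes. The sketch below is an idealized illustration of that property only: it builds the codes with the Sylvester construction and recovers the amplitudes of two superimposed pilots by correlation. The bipolar (+1/-1) signal model, the amplitudes, and the function names are assumptions; the abstract does not describe the actual event-based receiver processing.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # [[H, H], [H, -H]] doubles the size while keeping rows orthogonal.
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def correlate(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


codes = hadamard(4)

# Two hypothetical LEDs send pilot rows 1 and 2 with amplitudes 3 and 5;
# the receiver observes their sum at a shared image location.
mixed = [3 * a + 5 * b for a, b in zip(codes[1], codes[2])]

# Correlating with each code and normalizing by the code length isolates
# the contribution of the LED that used that code.
amplitudes = [correlate(mixed, row) / 4 for row in codes]
print(amplitudes)  # [0.0, 3.0, 5.0, 0.0]
```

Because distinct rows have zero correlation, codes that were not transmitted yield zero, and each transmitted code returns only its own amplitude.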


Snapshot multi-spectral imaging through defocusing and a Fourier imager network

Yang, Xilin, Fanous, Michael John, Chen, Hanlong, Lee, Ryan, Costa, Paloma Casteleiro, Li, Yuhang, Huang, Luzhe, Zhang, Yijie, Ozcan, Aydogan

arXiv.org Artificial Intelligence

Multi-spectral imaging, which simultaneously captures the spatial and spectral information of a scene, is widely used across diverse fields, including remote sensing, biomedical imaging, and agricultural monitoring. Here, we introduce a snapshot multi-spectral imaging approach employing a standard monochrome image sensor with no additional spectral filters or customized components. Our system leverages the inherent chromatic aberration of wavelength-dependent defocusing as a natural source of physical encoding of multi-spectral information; this encoded image information is rapidly decoded via a deep learning-based multi-spectral Fourier Imager Network (mFIN). We experimentally tested our method with six illumination bands and demonstrated an overall accuracy of 92.98% for predicting the illumination channels at the input and achieved a robust multi-spectral image reconstruction on various test objects. This deep learning-powered framework achieves high-quality multi-spectral image reconstruction using snapshot image acquisition with a monochrome image sensor and could be useful for applications in biomedicine, industrial quality control, and agriculture, among others.


Conceptual Design on the Field of View of Celestial Navigation Systems for Maritime Autonomous Surface Ships

Wakita, Kouki, Hane, Fuyuki, Sekiguchi, Takeshi, Shimizu, Shigehito, Mitani, Shinji, Akimoto, Youhei, Maki, Atsuo

arXiv.org Artificial Intelligence

In order to understand the appropriate field of view (FOV) size of celestial automatic navigation systems for surface ships, we investigate the variations of measurement accuracy of star position and probability of successful star identification with respect to FOV, focusing on the decreasing number of observable star magnitudes and the presence of physically covered stars in marine environments. The results revealed that, although a larger FOV reduces the measurement accuracy of star positions, it increases the number of observable objects and thus improves the probability of star identification using subgraph isomorphism-based methods. It was also found that, although at least four objects need to be observed for accurate identification, four objects may not be sufficient for wider FOVs. On the other hand, from the point of view of celestial navigation systems, a decrease in the measurement accuracy leads to a decrease in positioning accuracy. Therefore, it was found that maximizing the FOV is required for celestial automatic navigation systems as long as the desired positioning accuracy can be ensured. Furthermore, it was found that algorithms incorporating more than four observed celestial objects are required to achieve highly accurate star identification over a wider FOV.

  Genre: Research Report (1.00)
  Industry: Government > Military > Navy (0.60)

PixelGen: Rethinking Embedded Camera Systems

Li, Kunjun, Gulati, Manoj, Waskito, Steven, Shah, Dhairya, Chakrabarty, Shantanu, Varshney, Ambuj

arXiv.org Artificial Intelligence

Embedded camera systems are ubiquitous, representing the most widely deployed example of a wireless embedded system. They capture a representation of the world - the surroundings illuminated by visible or infrared light. Despite their widespread usage, the architecture of embedded camera systems has remained unchanged, which leads to limitations. They visualize only a tiny portion of the world. Additionally, they are energy-intensive, leading to limited battery lifespan. We present PixelGen, which re-imagines embedded camera systems. Specifically, PixelGen combines sensors, transceivers, and low-resolution image and infrared vision sensors to capture a broader world representation. These components are deliberately chosen for their simplicity, low bitrate, and low power consumption, culminating in an energy-efficient platform. We show that despite the simplicity, the captured data can be processed using transformer-based image and language models to generate novel representations of the environment. For example, we demonstrate that it can allow the generation of high-definition images, while the system utilises low-power, low-resolution monochrome cameras. Furthermore, the capabilities of PixelGen extend beyond traditional photography, enabling visualization of phenomena invisible to conventional cameras, such as sound waves. PixelGen can enable numerous novel applications, and we demonstrate that it enables unique visualization of the surroundings that are then projected on extended reality headsets. We believe PixelGen goes beyond conventional cameras and opens new avenues for research and photography.


UAV-borne Mapping Algorithms for Canopy-Level and High-Speed Drone Applications

Zhang, Jincheng, Wolek, Artur, Willis, Andrew R.

arXiv.org Artificial Intelligence

This article presents a comprehensive review and analysis of state-of-the-art mapping algorithms for UAV (Unmanned Aerial Vehicle) applications, focusing on canopy-level and high-speed scenarios. It also explores sensor technologies suitable for UAV mapping, assessing their capabilities to provide measurements that meet the requirements of fast UAV mapping. Furthermore, the study conducts extensive experiments in a simulated environment to evaluate the performance of three distinct mapping algorithms: Direct Sparse Odometry (DSO), Stereo DSO (SDSO), and DSO Lite (DSOL). The experiments delve into mapping accuracy and mapping speed, providing valuable insights into the strengths and limitations of each algorithm. The results highlight the versatility and shortcomings of these algorithms in meeting the demands of modern UAV applications. The findings contribute to a nuanced understanding of UAV mapping dynamics, emphasizing their applicability in complex environments and high-speed scenarios. This research not only serves as a benchmark for mapping algorithm comparisons but also offers practical guidance for selecting sensors tailored to specific UAV mapping applications.


Fully Quantized Always-on Face Detector Considering Mobile Image Sensors

Lee, Haechang, Jeong, Wongi, Ryu, Dongil, Je, Hyunwoo, No, Albert, Kim, Kijeong, Chun, Se Young

arXiv.org Artificial Intelligence

Despite significant research on lightweight deep neural networks (DNNs) designed for edge devices, the current face detectors do not fully meet the requirements for "intelligent" CMOS image sensors (iCISs) integrated with embedded DNNs. These sensors are essential in various practical applications, such as energy-efficient mobile phones and surveillance systems with always-on capabilities. One noteworthy limitation is the absence of suitable face detectors for the always-on scenario, a crucial aspect of image sensor-level applications. These detectors must operate directly with sensor RAW data before the image signal processor (ISP) takes over. This gap poses a significant challenge in achieving optimal performance in such scenarios. Further research and development are necessary to bridge this gap and fully leverage the potential of iCIS applications. In this study, we aim to bridge the gap by exploring extremely low-bit lightweight face detectors, focusing on the always-on face detection scenario for mobile image sensor applications. To achieve this, our proposed model utilizes sensor-aware synthetic RAW inputs, simulating always-on face detection processed "before" the ISP chain. Our approach employs ternary (-1, 0, 1) weights for potential implementations in image sensors, resulting in a relatively simple network architecture with shallow layers and extremely low-bitwidth. Our method demonstrates reasonable face detection performance and excellent efficiency in simulation studies, offering promising possibilities for practical always-on face detectors in real-world applications.
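For readers unfamiliar with ternary weights, the sketch below shows one common way to map floating-point weights to the values -1, 0, and 1 plus a single scale. The threshold rule (0.7 times the mean absolute weight) is a widely used heuristic assumed here for illustration; the abstract does not state how the proposed detector derives its ternary weights, and the sample weights are hypothetical.

```python
def ternarize(weights, factor=0.7):
    """Map floats to codes in {-1, 0, 1} and return a shared scale.

    The threshold heuristic is an assumption, not the paper's stated method.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = factor * mean_abs
    # Small-magnitude weights become 0; the rest keep only their sign.
    codes = [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    # The scale is the mean magnitude of the weights that survived.
    scale = sum(kept) / len(kept) if kept else 0.0
    return codes, scale


weights = [0.9, -0.05, 0.4, -0.8, 0.1]  # hypothetical layer weights
codes, scale = ternarize(weights)
print(codes)  # [1, 0, 1, -1, 0]
```

With codes restricted to three values, a multiply-accumulate reduces to an add, a subtract, or a skip, which is what makes such networks attractive for logic placed inside an image sensor.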


OBSBOT Tiny 2 4K review: A stellar webcam gets a solid upgrade

PCWorld

The OBSBOT Tiny 2 PTZ 4K webcam improves upon its predecessor's stellar video and maintains its awesome array of configuration options. Other aspects, though, take a slight step back. We weren't sure how much the OBSBOT Tiny 2 PTZ 4K Webcam would improve over its predecessor. An improved image sensor and some unexpected AI voice controls and beauty enhancements, though, help justify the extremely high price. At an MSRP of $329, the OBSBOT Tiny 2 is arguably 10 times the price of some budget cameras in our roundup of the best webcams.


Emergent Visual Sensors for Autonomous Vehicles

Li, You, Moreau, Julien, Ibanez-Guzman, Javier

arXiv.org Artificial Intelligence

Autonomous vehicles rely on perception systems to understand their surroundings for further navigation missions. Cameras are essential for perception systems due to the advantages in object detection and recognition that modern computer vision algorithms provide, compared to other sensors such as LiDARs and radars. However, limited by its inherent imaging principle, a standard RGB camera may perform poorly in a variety of adverse scenarios, including but not limited to low illumination, high contrast, and bad weather such as fog, rain, or snow. Meanwhile, estimating 3D information from 2D image detection is generally more difficult than with LiDARs or radars. Several new sensing technologies have emerged in recent years to address the limitations of conventional RGB cameras. In this paper, we review the principles of four novel image sensors: infrared cameras, range-gated cameras, polarization cameras, and event cameras. Their comparative advantages, existing or potential applications, and corresponding data processing algorithms are all presented in a systematic manner. We expect that this study will assist practitioners in the autonomous driving community with new perspectives and insights.